VIEWs are an undervalued and underused feature in SQL. They basically consist of a query that has been given a name, and a parameter list, so can be used like an inline macro. Technically, you’re supposed to think of it as a virtual table.
The idea of an inline macro or inline function goes way back to the first versions of Fortran and later BASIC. Depending on the product, the syntax might look like this: FN DOUBLE (N) = (N + N)
. You had to give the name of the Macro, a simple optional parameter list, assignment operator, usually an =, and the text of the computation. It was often important to put parentheses around the body of the macro, or to have a compiler that would do this for you.
This inline feature was underused in most of the languages that had it. After all, you are already using a text editor to input your code, so why not just put the code for the macro into the program text, using a “cut and paste” in your editor?
Basic Syntax for a CREATE VIEW Statement in SQL Server
The basic syntax for a CREATE VIEW
statement is not as straightforward as you might think. There are two options for the VIEW
:
- Views that are recursive,
- Views that have a check option.
We’ll get to those in a minute, but the basic syntax is:
1 2 3 4 |
CREATE VIEW <VIEW name> [(<VIEW column list>)] [WITH RECURSIVE <table name (<VIEW column list>)] AS <Query expression>; |
Ignore the options involved with a recursive VIEW
. We will discuss that separately, but a recursive view is almost never used in practice, because the recursive queries in Standard SQL are expensive and confusing. They are generally better handled with nested set models in the DDL (Data Definition Language), but if you must be able to take anything that generates a table and use it as the query expression in a VIEW
.
It’s always a good idea to provide a list of column names for the VIEW
, even if they just repeat the column names from the query. If anything subsequently changes, this is the best place to update the view without needing to fiddle with base tables. The column list will also allow you to change the column names in the view as necessary. For example, given a Personnel table, there might be a logical distinction between a regular, old employee, and employees who belong to a special group:
1 2 3 4 5 6 |
CREATE VIEW NYC_Team (emp_id, emp_name, city_name AS metro_nyc., ..) AS SELECT emp_id, emp_name, office_city_name, ... FROM Personnel WHERE location_name IN (“New York”, “New Jersey”, “Brooklyn”, “Bronx”, “Manhattan”); |
If you look at the internal schema DDL in most SQL products, you will find that it contains a fair number of relatively simple VIEWs that are assembled from base tables. This is for human readability and to make sure that everybody uses the same definition for reporting purposes.
Why are VIEWS Useful?
Views are useful because they let you modify an application from a single location. I wish it had always been like that. Decades ago, pre-SQL, I wrote a scheduling program for a friend who needed to have wait-staff at his restaurant. A problem that had to be resolved was that the age to drink and to serve alcohol had just jumped from 18 to 21 years of age, and this change did not get correctly propagated in the BASIC programs that we were using.
Cruel fate arranged that we had underage servers on the night that the Alcoholic Beverage Control people decided to run a surprise inspection…
Types of VIEWs
Nested VIEWs
You can put VIEWs inside other VIEWs. How deeply you can nest them depends on the product you’re using. The only rule is that each macro has to exist before it can be used. You might think this is a bit fussy, but in one of the very early SQL products, you could mistakenly write…
1 2 3 4 |
-- (wrong!) CREATE VIEW Foobar AS SELECT * FROM Foobar; |
This would hang the compiler in an endless loop. Essentially, it would try to resolve the FROM
clauses that nested from here to infinity, as ‘Foobar’ Is invoked endlessly. This led to the rule that tables referenced in the VIEW
must completely exist, not just be “in process”, and thereby appear, prematurely, in a symbol table.
A related problem is caused by declaring circular references. VIEW A depends on VIEW B, VIEW B references VIEW C and finally VIEW C is defined with VIEW A. This ends badly.
Union VIEWs
This is a typical pattern in VIEWs. The query that defines the VIEW
is a UNION
, rather than a simple SELECT
. The skeleton looks something like:
1 2 3 4 5 6 7 8 9 10 11 |
CREATE VIEW European_Sales (country_code, sales_total) AS (SELECT country_code, country_sales FROM Albania_Sales UNION SELECT country_code, country_sales FROM Bulgaria_Sales UNION … SELECT country_code, country_sales FROM Yugoslavia_Sales); |
In the SQL Standards, the result of a UNION
expression has no column names unless you explicitly create these names with a syntax as follows:
1 2 |
“(<UNION expression>) AS <table name> (<column name list>)” |
In practice, some products get the default column names from the first query construct in the UNION
, while others look at the last UNION
construct. There is also a rule that the statements in the list being unioned must be “union compatible”; this concept appears in other places in the language, but the idea is that columns in the results have to be at the same data type and same position. The intention is that you can glue these lists together easily. Today, they don’t have to be an exact match in most products, but are implicitly cast to a single, higher-level data type.
Remember that the SELECT
clause can also rename columns from the base tables. If you’re going to do this with a VIEW, you would assign the names in the CREATE VIEW
clause. This puts the information you need in one place. Rules for alias names in a query are trickier than you would first think; it might be worth an article on its own, but, for the time being, just remember these rules.
Computational VIEWs
How many kgs in 1 lbs? Let me save you the trouble of doing a Google search: 1 pound = 0.45359237 kilograms, to 9 decimal places. I think I can say with some certainty that this number does not roll off your tongue easily. I’m sure there’s a few nerds out there who know pi to at least 10 decimal places because that was the precision on of calculators in the late 1960s.
These conversion constants can be hidden in VIEWs, a convenience so that you can get to reports and queries easily in one system of measurements, but how many decimal places do you want to use? As a general rule, I would recommend carrying it out to what looks like a large number of decimal places, say two more than you think you’re going to need. This is because you can always round or truncate the results to get a VIEW with meaningful precision for your specific application, but you cannot reliably pad out decimal places. Most of the time, we are happy to use a factor of 2.2 pounds per kilogram because were more apt to work with groceries than with precious gems or physics experiments. I also must mention Celko’s law: “a number impresses people as the square of the decimal places, even if it is meaningless.”
Recursive VIEWs
Technically speaking, standard SQL allows for recursive queries. I’ve never seen anyone actually use them in production so I won’t go into any details. A recursive SQL query is a query that repeatedly calls itself to process hierarchical or self-referencing data structures, such as organizational charts, family trees, or parts explosions. Its execution is limited and expensive, so I would recommend that you use instead a nested sets model for such problems.
This construct consists of the CTE, which holds the results and is a recursive part. The other part is the anchor. The anchor is a base table or table expression used to define starting point.
The Common Table Expression (CTE) is a temporary, named result set within an SQL query. The Recursive CTE is a special type of CTE that allows for recursion. It references itself within its definition.
You need to inform the compiler that the CTE is going to be recursive in standard SQL. Some commercial products can detect this. Here’s the basic skeleton:
1 2 3 4 5 6 7 8 9 10 11 12 |
WITH RECURSIVE cte_name [(<column_name_ list)] -- Anchor SELECT ... FROM <table_name> WHERE <anchor condition> UNION ALL -- Recursive CTE expression SELECT ... FROM <cte_name> -- Referencing the CTE itself WHERE <termination_condition> -- beware of infinite loops! ) SELECT * FROM cte_name; |
The WITH CHECK Clause
The WITH CHECK
clause appears at the end of the CREATE VIEW
statement.
A VIEW that is defined WITH CHECK
enforces any rows that are modified or inserted against the SELECT
statement for that VIEW
.
VIEWs with the WITH CHECK
option are also called symmetric VIEWs. For example, a symmetric VIEW
that only returns employees in the Sales department will not allow insertion of employees in other departments. The WITH CHECK
option forces the WHERE
clause to be checked before actions are done against the base tables that define the VIEW
. In other words, the WITH CHECK
option does not affect the result of a SELECT
statement.
Updatable VIEWS
Not all VIEWs can be updated. In fact, this has been a topic in relational theory for a very long time. Chris Date even wrote a whole book on this topic: VIEW Updating and Relational Theory: Solving the VIEW Update Problem (ISBN-13: 978-1449357849).
In Standard SQL, a VIEW
is updatable if a row in the VIEW
maps exactly to one row in a base table. This means that there can be no computed columns, joins or set operations in the VIEW
.
However, there is a sneaky and complicated kludge to get around this limitation. Let’s say you have a VIEW
that has a column for the company’s total sales. Total sales are computed by adding up sales made by all of the branch offices. If you update the total sales, there is a business rule that you distribute new sales proportionately over the branches via some formula. You’re going to need to do arithmetic here.
The best way to do this is in a procedural language that’s designed to do arithmetic. However, you can do an INSTEAD OF trigger using math mixed with CASE expressions. You are essentially doing procedural coding with SQL declarative constructs! The advantage is that unlike a procedure call, you can’t avoid the trigger. Happily, none of this is explicitly exposed to the user.
Conclusion
A view allows the database to insulate the user from the complexities of accessing the base tables and can provide a convenient way to update a database as required. Although there is likely to be a performance cost, much of this can be avoided. If a VIEW
is used in many places in the same query, or is used in several queries in the same database, there’s a reasonable chance that a modern SQL engine will materialize it rather than recompute it each time. This can save execution time – just be careful that you don’t overdo nesting or query complexity in your views.
Load comments